fix: propagate model/variant/thinking to sweep aggregate summary#59
Conversation
The aggregate sweep rollup step (run_sweep_summary) was using raw
'opencode run' without resolving CodeCome's model configuration,
which meant CODECOME_MODEL, CODECOME_MODEL_VARIANT, OPENCODE_ARGS,
and codecome.yml agent pinning were all ignored for the aggregate
step while the per-file sweeps honoured them correctly.
Now resolve_runtime_config('auditor') is called and the resolved
model, variant, and thinking flags are passed explicitly to the
raw opencode run command, matching the per-file sweep behaviour.
Also prints the resolution source in the banner for visibility.
Adds three regression tests covering model+variant, thinking-only,
and nothing-resolved scenarios.
|
Warning Review limit reached
More reviews will be available in 31 minutes and 36 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthrough
ChangesDynamic model flag propagation in sweep summary
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Coverage Report
Generated by pytest-cov on |
Greptile SummaryThis PR fixes
Confidence Score: 5/5Safe to merge — the change is a focused, self-contained bug fix with no side effects on the per-file sweep path. The fix is minimal: one new import, one resolver call, and three conditional flag appends, all guarded by truthiness checks. The three new tests stub both the resolver and the subprocess call, covering every branch (model+variant, thinking-only, nothing resolved). No pre-existing behaviour is altered — the prompt text is still correctly appended last, and dry_run gating at the call site in main() remains unchanged. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["run_sweep_summary(files, summaries)"] --> B["resolve_runtime_config('auditor')"]
B --> C{rc.model?}
C -- yes --> D["extend(['--model', rc.model])"]
C -- no --> E{rc.variant?}
D --> E
E -- yes --> F["extend(['--variant', rc.variant])"]
E -- no --> G{rc.thinking_on?}
F --> G
G -- yes --> H["append('--thinking')"]
G -- no --> I["append(prompt_text)"]
H --> I
I --> J["subprocess.run(command, cwd=ROOT)"]
Reviews (2): Last reviewed commit: "chore: address review feedback - drop de..." | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tools/run-sweep.py (1)
194-225: 🏗️ Heavy liftMove aggregate summary runtime-resolution logic out of the root script and delegate to a package module.
This change adds more implementation behavior directly in
tools/run-sweep.py; per repo policy, root-level tools should stay delegation-only wrappers. Please moverun_sweep_summary’s config resolution + command assembly into a package module and have this script call it.As per coding guidelines, “Standalone scripts at the tools/ root … should be thin wrappers that delegate to their respective packages. Implementation must live in the package, not the script.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/run-sweep.py` around lines 194 - 225, Move the implementation logic from the run_sweep_summary function in tools/run-sweep.py to a package module. Specifically, extract the config resolution with resolve_runtime_config, the command assembly logic that builds the list with opencode run and the optional model/variant/thinking arguments, and the subprocess.run execution into a new function within the package. Have run_sweep_summary in the root script become a thin wrapper that calls this new package function, keeping only the print statements and file path handling in the script itself.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_run_sweep.py`:
- Line 267: The lambda stubs for resolve_runtime_config at lines 267, 309, and
344 ignore the agent input parameter, which means they cannot detect if the
resolver is being called with an incorrect role. Replace each of these three
stubs (in tests/test_run_sweep.py at lines 267, 309, and 344) with
implementations that assert the agent parameter contains the auditor role before
returning fake_rc. This will ensure that role-routing regressions are caught
during testing instead of going undetected.
---
Nitpick comments:
In `@tools/run-sweep.py`:
- Around line 194-225: Move the implementation logic from the run_sweep_summary
function in tools/run-sweep.py to a package module. Specifically, extract the
config resolution with resolve_runtime_config, the command assembly logic that
builds the list with opencode run and the optional model/variant/thinking
arguments, and the subprocess.run execution into a new function within the
package. Have run_sweep_summary in the root script become a thin wrapper that
calls this new package function, keeping only the print statements and file path
handling in the script itself.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: abbe8220-71c1-42b9-8866-d5f5dfbfe1e8
📒 Files selected for processing (2)
tests/test_run_sweep.pytools/run-sweep.py
…agent role in resolver stubs
pruiz
left a comment
There was a problem hiding this comment.
Addressing remaining review items:
@coderabbitai nitpick (move to package module): Deferred.
The suggestion to extract run_sweep_summary into a package module is architecturally valid and consistent with tools/AGENTS.md. However, tools/run-sweep.py is already a 280-line implementation-heavy script (argparse, file resolution, prompt building, subprocess orchestration) — it was never a thin wrapper. The 10 lines added here are fully consistent with the existing pattern. A proper package extraction of the sweep runner belongs in a separate architectural refactor PR, not as a blocking concern for this focused bug fix.
@coderabbitai inline (assert agent role): Fixed in bc4774a. All three lambda stubs now assert agent == "auditor".
@greptile-apps inline (dead monkeypatch param): Fixed in bc4774a. Removed from signature and call sites.
Problem
The aggregate sweep summary step (
run_sweep_summaryintools/run-sweep.py) was using rawopencode runwithout resolving CodeCome's model configuration. This meant:CODECOME_MODEL/CODECOME_MODEL_VARIANTenv varsOPENCODE_ARGS='--model ... --variant ...'codecome.ymlagent pinning forauditor...were all ignored for the aggregate step, while per-file sweep runs (which go through
tools/run-agent.py) honoured them correctly.When the pinned model is unsupported by the user's account (e.g.
gpt-5.5-cyber-previewwith a ChatGPT account), per-file sweeps work fine but the aggregate step fails with a bad request error because rawopencode runfalls back to the wrong model.Fix
run_sweep_summary()now callsresolve_runtime_config("auditor")(same resolution stack as per-file runs) and passes explicit--model,--variant, and--thinkingflags to the rawopencode runcommand. It also prints the resolution source in the banner for visibility.Tests
Three new tests in
TestRunSweepSummaryModelPropagation:test_passes_model_and_variant_flags— verifies--modeland--variantare passed when resolvedtest_passes_thinking_flag_when_on— verifies--thinkingis passed when thinking is ontest_no_flags_when_nothing_resolved— verifies no extra flags when nothing is resolvedVerification
mock_llm_parityfailures fromopencode serve 1.17.7timing bug)Summary by CodeRabbit